Warehousing Structured and Unstructured Data for Data

نویسندگان

  • Vasant Honavar
  • Tom Barta
چکیده

More data, especially unstructured data, is available to users than ever. There is so much data available that it is diicult for users to make use of their data in its raw form. To handle the diversity of data types, we have designed and prototyped a multidatabase/warehouse system. The system has been especially designed to facilitate the interaction of structured and unstructured data. The system makes use of object oriented views. The main features of the view mechanism, especially as they relate to textual documents, are presented in the paper. The system is designed to take target documents either from large repositories or from the Web. Issues for both sources of documents are examined in the paper. The paper also looks at how the view approach allows the interaction between the data taken from structured (e.g., relational), semistructured (e.g., object oriented) and unstructured (e.g. text) data sources. The warehouse support provided by the system is brieey examined and the paper concludes by looking at our approach to data mining and how the system will operate in the complete environment.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Text Analytics to Data Warehousing

─ Information hidden or stored in unstructured data can play a critical role in making decisions, understanding and conducting other business functions. Integrating data stored in both structured and unstructured formats can add significant value to an organization. With the extent of development happening in Text Mining and technologies to deal with unstructured and semi structured data like X...

متن کامل

Towards Business Intelligence over Unified Structured and Unstructured Data Using XML

Traditional data warehousing has been very successful in helping business enterprises to make intelligent decisions through declarative analysis of large amount of structured data stored in a relational database. However, not all enterprise data naturally fit into a relational model. Within an enterprise, there are huge amount of unstructured data, such as document content, emails, spreadsheets...

متن کامل

From data warehousing to active information integration systems

Enterprises have gathered operational business information frommultiple structured data sources and stored it in a central repository, called data warehousing, for decision support functionalities and data analysis. The enterprises are now realizing to integrate their entire information sources, including "unstructured" contents, for deeper and richer information analysis. Several applications,...

متن کامل

Towards Complex Data Warehousing : A new approach for integrating and modeling complex data

With the development of Internet, the availability of various types of data (multimedia, data from databases, ) has increased. These data which present different forms and semantics (we name them ”complex data”) may be unstructured, structured, or semi-structured. In order to prepare them for proper analysis, a multidimensional modeling layer is needed. To be efficient in terms of quality of se...

متن کامل

An MAS-Based ETL Approach for Complex Data

In a data warehousing process, the phase of data integration is crucial. Many methods for data integration have been published in the literature. However, with the development of the Internet, the availability of various types of data (images, texts, sounds, videos, databases...) has increased, and structuring such data is a difficult task. We name these data, which may be structured or unstruc...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1997